"VIDEO CODING METHOD AND CORRESPONDING CODED SIGNAL"



FIELD OF THE INVENTION 

The present invention relates to the field of video compression and, for instance,
to the video coding standards of the MPEG family (MPEG-1, MPEG-2, MPEG-4) and the ITU
H.26X family (H.261, H.263 and extensions, H.26L). More specifically, this invention concerns an
encoding method applied to a video sequence corresponding to successive scenes subdivided
into successive video object planes (VOPs) and generating, for coding all the video objects of
said scenes, a coded bitstream constituted of encoded video data in which each data item is
described by means of a bitstream syntax allowing to recognize and decode all the elements of
the content of said bitstream, said content being described in terms of separate channels.

The invention also relates to a corresponding encoding device, to a transmittable 
video signal consisting of a coded bitstream generated by such an encoding device, and to a 
device for receiving and decoding a video signal consisting of such a coded bitstream. 

BACKGROUND OF THE INVENTION 

In the first video coding standards (up to MPEG-2 and H.263), the video is
assumed to be rectangular and to be described in terms of a luminance channel and two
chrominance channels. With MPEG-4, other channels have been introduced : the alpha channel
(also referred to as the "arbitrary shape channel" in MPEG-4 terminology), for describing the
contours of the video objects, and, in a later version of MPEG-4, additional channels enabling
the transmission of contents like depth, disparity or transparency. The depth, for instance, can
be used for the applications where navigation in 3D is enabled. The disparity channel is used for
the applications for which two views of the content are required, so that said content can be
displayed on a device enabling stereoscopic viewing. The transparency channel is required for
the contents composed of different objects which may be superimposed (a transparency
channel for an object may be opaque - the object texture then overwrites the texture of the
other objects - or half-transparent, the texture on the display then resulting from the blending
of the texture of the objects).

As defined in the MPEG-4 document w3056, "Information Technology - Coding of
audio-visual objects - Part 2 : Visual", ISO/IEC JTC1/SC29/WG11, Maui, USA, December 1999,
part 6.2.3 Video Object Layer, the only way (in MPEG-4) to describe the additional channels like
transparency or disparity or depth of a sequence is the use of the syntactic element
"video_object_layer_shape_extension". The syntax and the semantics provided by MPEG-4 in
order to support the coding of additional channels via said element are given in pages 35-36
and 110-112 of the document w3056 :

(a) "video_object_layer_verid" : this 4-bit code, defined in table 6-11, identifies the
version number of the video object layer ;

(b) "video_object_layer_shape" : this 2-bit code, defined in table 6-14, identifies
the shape type of a video object layer ;

(c) "video_object_layer_shape_extension" : this 4-bit code, defined in table V2-1,
identifies the number (up to 3) and type of auxiliary components that can be used (only a
limited number of types and combinations are defined in said table, and more applications are
possible by selection of the USER DEFINED type).

This syntax and these semantics show that the support for the transmission of additional
channels is only provided for objects having a shape. In case one wants to transmit the
luminance and chrominance channels and one additional channel like the disparity of a
rectangular object, it can indeed be explained how MPEG-4 is suboptimal in terms of coding
efficiency. In MPEG-4, the description of a rectangular object (knowing that it is really
rectangular since the code "video_object_layer_shape" is then equal to 00) requires the
transmission of the size of the rectangle in terms of width and height. This description, which
is given in the Video Object Layer syntax (see the five lines 25 to 30 of p.36 of the document),
requires 31 bits. When one wants to transmit additional channels like the depth channel or the
disparity channel of a rectangular object with the MPEG-4 syntax, there is no other means than
to declare the object as non rectangular by setting the code "video_object_layer_shape" to 11
(greyscale).

Once the object has been declared as being greyscale (although it is rectangular),
the syntax forces bits describing the shape of the object to be sent, which is done at the
macroblock level according to the syntax given in the document, p.52, § 6.2.6 Macroblock, lines
1 to 6 of the table, and p.56, § 6.2.6.1 MB Binary Shape Coding, lines 1 to 5 of the table. As
indicated in p.128-129 of the document, bab_type is a variable length code comprised between
1 and 7 bits and provided for indicating the coding mode used for the binary alpha block of 16 x
16 pixels, and the seven bab_types are depicted in table 6-26. Such a description leads, for CIF
pictures for instance, to a waste of at least 396 bits per frame (at least one bit per
macroblock). For a 25 Hz CIF sequence, the overhead is estimated at 9.9 kbits/s.
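The overhead figures quoted above can be checked with a short calculation. The sketch below assumes the usual CIF picture size of 352 x 288 luminance samples (the text names the format but not its dimensions) and the lower bound of one bab_type bit per 16 x 16 macroblock; the function name is illustrative only.

```python
MACROBLOCK_SIZE = 16  # luminance samples per macroblock side


def shape_overhead(width, height, frame_rate_hz, min_bits_per_macroblock=1):
    """Return (bits per frame, bits per second) spent on per-macroblock shape codes."""
    mb_cols = -(-width // MACROBLOCK_SIZE)   # ceiling division
    mb_rows = -(-height // MACROBLOCK_SIZE)
    bits_per_frame = mb_cols * mb_rows * min_bits_per_macroblock
    return bits_per_frame, bits_per_frame * frame_rate_hz


# CIF at 25 Hz; the 352 x 288 picture size is an assumption, not stated in the text.
bits_per_frame, bits_per_second = shape_overhead(352, 288, 25)
print(bits_per_frame)          # 396
print(bits_per_second / 1000)  # 9.9
```

Since bab_type may take up to 7 bits, these values are a floor rather than a typical cost.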



SUMMARY OF THE INVENTION 

It is therefore an object of the invention to propose a video coding method
that avoids this waste of bits and therefore improves the coding efficiency.

To this end, the invention relates to a method such as defined in the introductory
part of the description and which is moreover characterized in that said syntax comprises a
specific information indicating at a high description level in the bitstream the presence, or not,
of the various channels that can be encountered to describe the content of the bitstream.

Preferably, said specific information consists of the following additional syntax
elements :

video_object_layer_shape : 1 bit

number_of_video_object_layer_additional_channel_descriptions : n bits

video_object_layer_additional_channels [i] : 1 bit

the first element indicating the presence, or not, of a contour or shape channel that should then
be decoded, the second one representing the number of additional channel syntax elements
present in the coded bitstream in order to describe the content of said bitstream, and the third
one identifying the presence, or not, of the channel addressed by the value [i], i taking a value
between 0 and 2^n - 1.

In another embodiment of the invention, said specific information consists of the
following additional syntax elements :

video_object_layer_shape : 1 bit

number_of_video_object_layer_additional_channel_presence : n bits

video_object_layer_additional_channels [i] : 1 bit

the first element indicating the presence, or not, of a contour or shape channel that should then
be decoded, the second one representing the number of additional channels present in the
coded bitstream, and the third one identifying the presence, or not, of the channel addressed by
the value [i], i taking a value between 0 and 2^n - 1.

In a third embodiment, said specific information consists of the following additional
syntax elements :

video_object_layer_shape : 1 bit

video_object_layer_additional_channels [i] : 1 bit, 0 <= i <= 2^n - 1

the first element indicating the presence, or not, of a contour or shape channel that should then
be decoded, the second one identifying the presence, or not, of the channel addressed by the
value [i], i taking a value between 0 and 2^n - 1.

With any one of these three solutions, the video_object_layer_shape syntax element
may be omitted from the bitstream.

The invention also relates to a device for encoding a video sequence corresponding
to successive scenes subdivided into successive video object planes (VOPs), said device
comprising means for structuring each scene of said sequence as a composition of video objects
(VOs), means for coding the shape, the motion and the texture of each of said VOs, and means
for multiplexing the coded elementary streams thus obtained into a single coded bitstream
constituted of encoded video data in which each data item is described by means of a bitstream
syntax allowing to recognize and decode all the elements of the content of said bitstream, said
content being described in terms of separate channels, said device being further characterized
in that it also comprises means for introducing into said coded bitstream a specific information
indicating at a high description level in this coded bitstream the presence, or not, of various
additional channels that can be encountered to describe the content of said bitstream.

The invention also relates to a transmittable video signal consisting of a coded
bitstream generated by an encoding method applied to a sequence corresponding to successive
scenes subdivided into successive video object planes (VOPs), said coded bitstream, generated
for coding all the video objects of said scenes, being constituted of encoded video data in which
each data item is described by means of a bitstream syntax allowing to recognize and decode
all the elements of the content of said bitstream, said content being described in terms of
separate channels, said signal being further characterized in that said coded bitstream also
comprises a specific information indicating at a high description level in this coded bitstream the
presence, or not, of various additional channels that can be encountered to describe the content
of said bitstream.

The invention finally relates to a device for receiving and decoding a video signal
consisting of a coded bitstream generated by an encoding method applied to a video sequence
corresponding to successive scenes subdivided into successive video object planes (VOPs), said
coded bitstream, generated for coding all the video objects of said scenes, being constituted of
encoded video data in which each data item is described by means of a bitstream syntax
allowing to recognize and decode all the elements of the content of said bitstream, said content
being described in terms of separate channels, said coded bitstream moreover comprising a
specific information indicating at a high description level in this coded bitstream the presence,
or not, of various additional channels that can be encountered to describe the content of said
bitstream.

DETAILED DESCRIPTION OF THE INVENTION 

To solve the problem of waste of bits explained above, it is proposed,
according to the invention, to introduce into the coded bitstream an indication about the
possible presence of additional channels. This indication consists of a specific information
introduced, according to the invention, at a high description level at least equivalent to
the Video Object Layer (VOL) MPEG-4 level.

This additional descriptive step is implemented for example as now
indicated. The following syntactic elements are defined :

(a) "video_object_layer_shape" : 1 bit

(b) "number_of_video_object_layer_additional_channel_descriptions" : n bits

(c) "video_object_layer_additional_channel [i]" : 1 bit

and the semantic meaning of these elements is :

(a) video_object_layer_shape : this 1-bit flag indicates the presence of a shape (or
contour) channel (if set to one, the contour channel is present and should be decoded, while no
description of shape or contour is expected if it is not) ;

(b) number_of_video_object_layer_additional_channel_descriptions : this n-bit
unsigned integer represents the number of additional channel syntax elements present in the
coded bitstream ;

(c) additional_channel_number : this integer takes values comprised between 0
and number_of_video_object_layer_additional_channel_descriptions ;

(d) video_object_layer_additional_channel [additional_channel_number] :
this 1-bit flag identifies the presence or not of the channel addressed by the value [i] of
additional_channel_number.

The correspondences between video_object_layer_additional_channel
[additional_channel_number] and the semantic of the related channel are given in the following
table, for values 1 to 2^n of number_of_video_object_layer_additional_channel_descriptions,
called NAC in the table (n = 4 in the given example) :



Additional_channel_number    Semantic                           No. of bits    NAC

0                            video_object_layer_lum                            1
1                            video_object_layer_transparency                   2
2                            video_object_layer_disparity                      3
3                            video_object_layer_texture                        4
4                            video_object_layer_depth                          5
5                            user_defined                                      6
6                            user_defined                                      7
7                            user_defined                                      8
8                            user_defined                                      9
9                            user_defined                                      10
10                           user_defined                                      11
11                           user_defined                                      12
12                           user_defined                                      13
13                           user_defined                                      14
14                           user_defined                                      15
15                           user_defined                                      16







The proposition according to the invention leads therefore to a modified version of
the syntax for Video_object_layer. In page 36 of the document w3056, the following syntactic
elements are added (lines 15 and following) :



video_object_layer_shape                                                  1    uimsbf
if (video_object_layer_verid > 2) {
    number_of_video_object_layer_additional_channel_descriptions         n    uimsbf
    for (j = 0 ; j < number_of_video_object_layer_additional_channel_descriptions ; j++)
        video_object_layer_additional_channels[j]                         1    uimsbf
}
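A decoder-side reading of the syntax table above might look like the following minimal sketch. The BitReader class, the function name and the dictionary layout are hypothetical helpers introduced for illustration; n = 4 is taken from the example given with the channel table, and uimsbf is read as an unsigned integer, most significant bit first.

```python
class BitReader:
    """Reads unsigned integers, most significant bit first, from a string of '0'/'1'."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self, width):
        if self.pos + width > len(self.bits):
            raise ValueError("not enough bits left in the stream")
        chunk = self.bits[self.pos:self.pos + width]
        self.pos += width
        return int(chunk, 2) if chunk else 0


def parse_channel_header(reader, verid, n=4):
    """Parse the shape flag, the n-bit count and one presence flag per description."""
    header = {"shape": reader.read(1), "additional_channels": []}
    if verid > 2:
        count = reader.read(n)
        header["additional_channels"] = [reader.read(1) for _ in range(count)]
    return header


# Shape flag 0, count 3 (binary 0011), then three presence flags.
header = parse_channel_header(BitReader("0" + "0011" + "101"), verid=3)
print(header)  # {'shape': 0, 'additional_channels': [1, 0, 1]}
```

The sketch follows the table literally and therefore does not read the chrominance flag mentioned in the implementation examples.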







Examples of implementation (channel presence description + corresponding
syntax) for various types of objects may be given, the syntax element which indicates the
presence of chrominance channels being decoded only if the presence of a luminance channel
has been indicated in the bitstream :

(a) a coloured 4:2:2 rectangular sequence :

video_object_layer_shape : 0
number_of_video_object_layer_additional_channel_descriptions : 1
video_object_layer_lum : 1
video_object_layer_chrom : 1

(b) a black-and-white scene with an opaque object having a contour but no
texture :

video_object_layer_shape : 1
number_of_video_object_layer_additional_channel_descriptions : 0

(c) a 4:2:2 black-and-white object having an opaque shape (or contour) :

video_object_layer_shape : 1
number_of_video_object_layer_additional_channel_descriptions : 1
video_object_layer_lum : 1
video_object_layer_chrom : 0

(d) a coloured 4:2:2 rectangular object having a transparent alpha plane :

video_object_layer_shape : 0
number_of_video_object_layer_additional_channel_descriptions : 2
video_object_layer_lum : 1
video_object_layer_chrom : 1
video_object_layer_transparency : 1

(e) a 4:2:2 rectangular object with its depth :

video_object_layer_shape : 0
number_of_video_object_layer_additional_channel_descriptions : 5
video_object_layer_lum : 1
video_object_layer_chrom : 1
video_object_layer_transparency : 0
video_object_layer_disparity : 0
video_object_layer_texture : 0
video_object_layer_depth : 1
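The examples above can be turned into header bit strings with a small encoder. This is only a sketch under stated assumptions: n = 4, the channel order of the correspondence table, and a chrominance flag written immediately after the luminance flag whenever the latter is 1 (an inference from the order in which the examples list their elements). The names CHANNEL_ORDER and encode_channel_header are hypothetical.

```python
# Channel order taken from the correspondence table (user_defined entries omitted).
CHANNEL_ORDER = ["lum", "transparency", "disparity", "texture", "depth"]


def encode_channel_header(shape, nac, present, n=4):
    """Return the header bits for one object as a string of '0'/'1'."""
    if not 0 <= nac <= len(CHANNEL_ORDER):
        raise ValueError("this sketch only covers the five named channels")
    bits = str(int(bool(shape))) + format(nac, "0{}b".format(n))
    for name in CHANNEL_ORDER[:nac]:
        flag = name in present
        bits += "1" if flag else "0"
        # Assumption: the chrominance flag is sent only when luminance is present.
        if name == "lum" and flag:
            bits += "1" if "chrom" in present else "0"
    return bits


example_a = encode_channel_header(0, 1, {"lum", "chrom"})
example_e = encode_channel_header(0, 5, {"lum", "chrom", "depth"})
print(example_a)  # 0000111
print(example_e)  # 00101110001
```

Under these assumptions example (e) costs 11 header bits, sent once at the description level rather than per macroblock.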



The two following alternative syntaxes may also be proposed: 
video_object_layer_shape                                                  1    uimsbf
if (video_object_layer_verid > 2) {
    number_of_video_object_layer_additional_channel_...                   n    uimsbf
    j = 0;
    k = 0;
    while (j < number_of_video_object_layer_additional_channel_...)
    {
        j = j + video_object_layer_additional_channels[k];                1    uimsbf
        k = k + 1;
    }
}
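In the while-loop variant above, the transmitted count appears to be the number of channels that are present, so the decoder keeps reading flags until it has seen that many ones. A minimal sketch, assuming n = 4, a version identifier greater than 2, and a bitstream held as a string of '0'/'1' characters (the function and variable names are hypothetical):

```python
def parse_presence_counted(bits, n=4):
    """Read flags until the announced number of present channels has been found."""
    if len(bits) < 1 + n:
        raise ValueError("header is shorter than the shape flag plus the n-bit count")
    shape = int(bits[0])
    present_count = int(bits[1:1 + n], 2)
    pos = 1 + n
    flags = []
    found = 0  # plays the role of j; len(flags) plays the role of k
    while found < present_count:
        if pos >= len(bits):
            raise ValueError("stream ended before all present channels were found")
        flag = int(bits[pos])
        pos += 1
        flags.append(flag)
        found += flag
    return shape, flags, pos


# Two channels present (indices 0 and 4); the trailing bits belong to later syntax.
shape, flags, used = parse_presence_counted("0" + "0010" + "10001" + "111")
print(shape, flags, used)  # 0 [1, 0, 0, 0, 1] 10
```

Compared with the first syntax, trailing absent channels cost no bits, since reading stops at the last present one.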

video_object_layer_shape                                                  1    uimsbf
if (video_object_layer_verid > 2) {
    number_of_video_object_layer_addition... = 2^n ;
    for (j = 0 ; j < number_of_video_object_layer_addition... ; j++)
        video_object_layer_additional_channels[j]                         1    uimsbf
}
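In this last variant no count is transmitted: the number of flags is fixed at 2^n, so the decoder always reads the same number of bits. A minimal sketch, assuming n is a constant known to both encoder and decoder (n = 2 is used here only to keep the example short) and a version identifier greater than 2; the function name is hypothetical.

```python
def parse_fixed_flags(bits, n=2):
    """Read the shape flag followed by exactly 2**n channel presence flags."""
    total = 2 ** n
    if len(bits) < 1 + total:
        raise ValueError("stream is shorter than the shape flag plus 2**n flags")
    shape = int(bits[0])
    present = [i for i, b in enumerate(bits[1:1 + total]) if b == "1"]
    return shape, present


print(parse_fixed_flags("0" + "1001"))  # (0, [0, 3])
```

The header length is then constant (1 + 2^n bits), which trades a few bits for a simpler parser.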

CLAIMS :

1. An encoding method applied to a video sequence corresponding to successive
scenes subdivided into successive video object planes (VOPs) and generating, for coding all the
video objects of said scenes, a coded bitstream constituted of encoded video data in which each
data item is described by means of a bitstream syntax allowing to recognize and decode all the
elements of the content of said bitstream, said content being described in terms of separate
channels, said method being further characterized in that said syntax comprises a specific
information indicating at a high description level in said coded bitstream the presence, or not, of
various additional channels that can be encountered to describe the content of said bitstream.

2. A method according to claim 1, in which said specific information consists of
the following additional syntax elements :

video_object_layer_shape : 1 bit

number_of_video_object_layer_additional_channel_descriptions : n bits

video_object_layer_additional_channels [i] : 1 bit

the first element indicating the presence, or not, of a contour or shape channel that should then
be decoded, the second one representing the number of additional channel syntax elements
present in the coded bitstream in order to describe the content of said bitstream, and the third
one identifying the presence, or not, of the channel addressed by the value [i], i taking a value
between 0 and 2^n - 1.

3. A method according to claim 1, in which said specific information consists of
the following additional syntax elements :

video_object_layer_shape : 1 bit

number_of_video_object_layer_additional_channel_presence : n bits

video_object_layer_additional_channels [i] : 1 bit

the first element indicating the presence, or not, of a contour or shape channel that should then
be decoded, the second one representing the number of additional channels present in the
coded bitstream, and the third one identifying the presence, or not, of the channel addressed by
the value [i], i taking a value between 0 and 2^n - 1.

4. A method according to claim 1, in which said specific information consists of
the following additional syntax elements :

video_object_layer_shape : 1 bit

video_object_layer_additional_channels [i] : 1 bit, 0 <= i <= 2^n - 1

the first element indicating the presence, or not, of a contour or shape channel that should then
be decoded, and the second one identifying the presence, or not, of the channel addressed by
the value [i], i taking a value between 0 and 2^n - 1.

5. A method according to any one of claims 2 to 4, characterized in that the
video_object_layer_shape syntax element is not provided in the bitstream.

6. A device for encoding a video sequence corresponding to successive scenes
subdivided into successive video object planes (VOPs), said device comprising means for
structuring each scene of said sequence as a composition of video objects (VOs), means for
coding the shape, the motion and the texture of each of said VOs, and means for multiplexing
the coded elementary streams thus obtained into a single coded bitstream constituted of
encoded video data in which each data item is described by means of a bitstream syntax
allowing to recognize and decode all the elements of the content of said bitstream, said content
being described in terms of separate channels, said device being further characterized in that it
also comprises means for introducing into said coded bitstream a specific information indicating
at a high description level in said coded bitstream the presence, or not, of various additional
channels that can be encountered to describe the content of said bitstream.

7. A transmittable video signal consisting of a coded bitstream generated by an
encoding method applied to a video sequence corresponding to successive scenes subdivided
into successive video object planes (VOPs), said coded bitstream, generated for coding all the
video objects of said scenes, being constituted of encoded video data in which each data item is
described by means of a bitstream syntax allowing to recognize and decode all the elements of
the content of said bitstream, said content being described in terms of separate channels, said
signal being further characterized in that said coded bitstream also comprises a specific
information indicating at a high description level in said coded bitstream the presence, or not, of
various additional channels that can be encountered to describe the content of said bitstream.

8. A device for receiving and decoding a video signal consisting of a coded
bitstream generated by an encoding method applied to a video sequence corresponding to
successive scenes subdivided into successive video object planes (VOPs), said coded bitstream,
generated for coding all the video objects of said scenes, being constituted of encoded video
data in which each data item is described by means of a bitstream syntax allowing to recognize
and decode all the elements of the content of said bitstream, said content being described in
terms of separate channels, said coded bitstream moreover comprising a specific information
indicating at a high description level in said coded bitstream the presence, or not, of various
additional channels that can be encountered to describe the content of said bitstream.

Abstract

The invention relates to an encoding method applied to a video sequence
corresponding to successive scenes and generating a coded bitstream in which each
data item is described by means of a bitstream syntax allowing, at the decoding
side, to recognize and decode all the elements of the content of this coded
bitstream. According to the invention, said syntax comprises a specific information
indicating at a high description level in said bitstream the presence, or not, of
various additional channels that can be encountered to describe the content of said
bitstream. Several examples of specific information are given.